- 
                Notifications
    
You must be signed in to change notification settings  - Fork 47
 
[RFC] - Config is code #512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This PR showcases different styles for our config system, exploring the "config is code" philosophy. Implements 5 approaches: 1. OmegaConf/Hydra (YAML-based) 2. Fiddle (Google's config library) 3. Factory + dataclasses 4. Plain Python with functools.partial 5. Plain Python with dictionaries Each approach demonstrates handling of: - Lazy instantiation - Nested component instantiation - Partial application for runtime args - Config composition and overrides
ec6a892    to
    55c308d      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting up these candidates! Left some initial impressions, might not be accurate.
| inner_partial = cfg2["model"].keywords["attn_config"] | ||
| inner_partial.keywords["num_heads"] = 64 | ||
| model2 = cfg2["model"]() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definition part is OK, but its consumption is confusing when it comes to nested init
| return base | ||
| 
               | 
          ||
| cfg_variant = llama3_2_1b_large_lr() | ||
| attn_config_variant = cfg_variant["model"]["kwargs"]["attn_config"]["cls"]( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems too flexible, hard to read, and error-prone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, its horrible. I put it here to make a contrast
| # LICENSE file in the root directory of this source tree. | ||
| 
               | 
          ||
| """ | ||
| Dataclass config with inner Config classes. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we get free cli override with tyro?
https://github.com/pytorch/torchtitan/blob/2ea6197b957936bdd4941e59a000cf31987a3184/torchtitan/config/manager.py#L56
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that looks nice, didnt know about it
| cfg2.optimizer.lr = 1e-4 | ||
| optimizer_partial2 = instantiate(cfg2.optimizer) | ||
| optimizer2 = optimizer_partial2(params=model2.parameters()) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, perhaps there is a better way? Not sure if i can pass the params already in instantiate.
The spirit here is that i can actually do once:
instantiate(cfg), and everything gets instantiated already at the very start, but when i call cfg.optimizer, i will get a partial.
In this example it feels weird because instead of doing `instantiate(config)```, i am instantiating each arg of the config individually
| - Lazy instantiation via hydra.utils.instantiate | ||
| - Command-line override for free (--optimizer.lr=1e-4) | ||
| 
               | 
          ||
| Cons: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plus learning curve, compose, instantiate, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in practice, i think that people would only use instantiate, and we can even get rid of this and instantiate the whole config once at the start. Partials can then be called where needed. e.g.
def main(path_cfg:str):
	cfg = instantiate(load_cfg(path_cfg))
	
    model = cfg.model
	optimizer = cfg.optimizer(params=model.param)
| def llama3_2_1b_full(): | ||
| output_dir = "/tmp/torchtune/llama3_2_1B/full" | ||
| 
               | 
          ||
| return { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this one and the fiddle one, I guess you can take this to extreme and make everything a (data)class, e.g. TrainerWithConfig class can have model, optimizer, data loader, etc. What's your thought?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My issue with this pattern of TrainerWithConfig.config() is that its too opinionated and impacts the entire codebase. 3rd party utilities, or local code that dont need to be a class, must be represented as such.
Between the two, i personally prefer fiddle.
But when i read config_fiddle.py and i read baseline.yaml, to my eyes, baseline.yaml is easier to parse and understand. Perhaps because its a simple config? I can imagine cases where using python can be handy.
On the composability side, if you look at baseline_different_bsz.yaml, it seems easier to abstract away experimentation from infra args.
TDLR
- despite all of the hate, IMO .yaml + OmegaConf and/or Hydra is the lesser of all evils.
 - If we want to use .py, fiddle seems the easiest
 - dataclasses.config pattern are the safest, but impact the entire code base and is harder to read
 
This PR attempts to showcase the different styles we could use for our config system. A push here is if we can leave the .yaml and use .py instead, under the reasoning that "config is code".
Python can enable a) better type safety, b) import directly instead of using strings, and c) more flexibility for the user to define custom code within the config.
On the other hand, yaml can be easier to read and hydra provides some nice freebies.
Must have:
Nice to have:
Proposed options:
TODO: toml
I would like to hear what others think about the options, if we should stick with yaml and fully explore hydra/omegaconf or change to .py system.